Add Sixel support #3734

Draft: wants to merge 32 commits into base: v2_develop


tznind
Collaborator

@tznind tznind commented Sep 9, 2024

Since Windows Terminal Preview has sixel support, I thought I'd spend a couple of evenings looking at how we might achieve it. See the #1265 discussions for more info.

I couldn't find any C# encoder libraries so was tinkering with writing one. The 2 steps are basically:

  • Work out what 256 color palette you will use
  • Encode the pixel data (using nearest color in palette)
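The second step (mapping each pixel to its nearest palette color) could be sketched roughly as below. This is only an illustration: `PaletteSketch` and `NearestIndex` are hypothetical names, and plain squared Euclidean distance in RGB space is an assumption, not necessarily the distance measure used in this branch.

```csharp
using System;

public static class PaletteSketch
{
    // Returns the index of the palette entry closest to (r, g, b).
    // Assumption: closeness is squared Euclidean distance in RGB space.
    public static int NearestIndex ((byte R, byte G, byte B) [] palette, byte r, byte g, byte b)
    {
        int best = 0;
        int bestDistance = int.MaxValue;

        for (int i = 0; i < palette.Length; i++)
        {
            int dr = palette [i].R - r;
            int dg = palette [i].G - g;
            int db = palette [i].B - b;
            int distance = dr * dr + dg * dg + db * db;

            if (distance < bestDistance)
            {
                bestDistance = distance;
                best = i;
            }
        }

        return best;
    }
}
```

With a palette of black, red and blue, a pixel of (200, 10, 10) would map to the red entry.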

I am struggling to understand how the pixel data writing works but have been leaning a lot on ChatGPT and got something that kinda sorta works:

The big pixely image is what the existing Images scenario outputs: 1 cell per pixel

The high res image in top left is a sixel encoded using the code in this branch.

image

The way I understand it is you write the data as 6 vertical pixels at once:

image

You specify the color you are painting in, let's say this color:

image

Then you specify the cells that it is to fill in, in this case let's say these 2 (in pink):

image

This gives you the following bitmask

       A sixel is a column of 6 pixels - with a width of 1 pixel

    Column controlled by one sixel character:
      [0]  - Bit 0 (top-most pixel)
      [1]  - Bit 1
      [1]  - Bit 2
      [0]  - Bit 3
      [0]  - Bit 4
      [0]  - Bit 5 (bottom-most pixel)

Bit 0 (the top pixel) is the least significant bit, so reading from the bottom pixel up to the top pixel gives the binary 000110. You turn that into an int and add 63. That turns into an ASCII character.

The bitmask is 000110, which in decimal is 6. Adding 63 (the sixel character offset) gives 6 + 63 = 69, which corresponds to the ASCII character E.

0x3F in hexadecimal corresponds to 63 in decimal, which is the ASCII value for the character ? [...] sixels use a base-64 encoding to represent pixel data.
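As a rough sketch of that calculation, assuming bit 0 (the top pixel) is the least significant bit as the sixel format defines it: `SixelCharSketch` and `ToSixelChar` are hypothetical names used only for illustration. Under that reading, the column shown above (bits 1 and 2 set) has the value 2 + 4 = 6 and encodes as 'E'.

```csharp
using System;

public static class SixelCharSketch
{
    // pixels[0] is the top pixel of the column and pixels[5] the bottom one.
    // Each set pixel contributes 2^index, and 63 ('?') is added to reach
    // the printable range '?' (no pixels set) to '~' (all six set).
    public static char ToSixelChar (bool [] pixels)
    {
        if (pixels.Length != 6)
        {
            throw new ArgumentException ("A sixel column has exactly 6 pixels.", nameof (pixels));
        }

        int value = 0;

        for (int bit = 0; bit < 6; bit++)
        {
            if (pixels [bit])
            {
                value |= 1 << bit;
            }
        }

        return (char)(value + 63);
    }
}
```

An empty column comes out as '?' and a fully filled column as '~', which is a quick way to sanity check encoder output by eye.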

Then there's logic I don't fully get around drawing multiple color layers over the top to build up the rest of the pixels that weren't the color you were drawing. To do that you have to output $ to 'rewind' to the start of the row and draw in the next color.

You switch color with #1 or #0 or whatever number of color you want from the palette
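The layering described above might look something like the following for a single band. It is a minimal sketch, not the encoder in this branch: `SixelBandSketch` and `EncodeBand` are hypothetical names, the input is assumed to be a 6-row array of palette indexes, and no run-length compression is attempted.

```csharp
using System;
using System.Text;

public static class SixelBandSketch
{
    // band[row, x] holds the palette index of each pixel; the band is 6 rows high.
    // For every color that appears, one pass over the band is emitted:
    // "#<index>" selects the color, then one sixel character per column.
    // '$' between passes rewinds to the start of the band.
    public static string EncodeBand (int [,] band, int paletteSize)
    {
        if (band.GetLength (0) != 6)
        {
            throw new ArgumentException ("A band must be exactly 6 pixels high.", nameof (band));
        }

        int width = band.GetLength (1);
        var output = new StringBuilder ();
        bool firstPass = true;

        for (int color = 0; color < paletteSize; color++)
        {
            var pass = new StringBuilder ();
            bool used = false;

            for (int x = 0; x < width; x++)
            {
                int value = 0;

                for (int bit = 0; bit < 6; bit++)
                {
                    if (band [bit, x] == color)
                    {
                        value |= 1 << bit; // bit 0 is the top pixel
                    }
                }

                used |= value != 0;
                pass.Append ((char)(value + 63));
            }

            if (!used)
            {
                continue; // skip colors that do not appear in this band
            }

            if (!firstPass)
            {
                output.Append ('$'); // rewind before drawing the next color
            }

            output.Append ('#').Append (color).Append (pass.ToString ());
            firstPass = false;
        }

        return output.ToString ();
    }
}
```

For a band two pixels wide whose left column is entirely color 0 and whose right column is entirely color 1, this produces `#0~?$#1?~`.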

Fixes

  • Fixes #_____

Proposed Changes/Todos

  • Todo 1

Pull Request checklist:

  • I've named my PR in the form of "Fixes #issue. Terse description."
  • My code follows the style guidelines of Terminal.Gui - if you use Visual Studio, hit CTRL-K-D to automatically reformat your files before committing.
  • My code follows the Terminal.Gui library design guidelines
  • I ran dotnet test before commit
  • I have made corresponding changes to the API documentation (using /// style comments)
  • My changes generate no new warnings
  • I have checked my code and corrected any poor grammar or misspellings
  • I conducted basic QA to assure all features are working

@dodexahedron
Collaborator

Hey I had a thought to bounce off of ya:

This seems like something that would be particularly well suited as a pluggable component, in whatever form that may take, be it a satellite assembly/module, dynamically loaded libraries in a configurable path, straight c# code compiled at run-time, or whatever.

In any of them, the glue would be essentially the same - an interface any plugin must implement, with them then being free to do whatever they like beyond that in their own code.

Any thoughts on that?

The work I'm doing on the drivers will make that kind of thing a lot easier for us to provide, since I'm pulling out interfaces for the public API.

@tznind
Collaborator Author

tznind commented Sep 12, 2024

Could be an option 🤔. At the moment I am still in the exploration phase.

The driver level bit is basically

  • move to x,y
  • output sixel

Currently I'm doing this every render pass of driver which is very slow. But I'm not sure how much of that is down to the pixel encoded instructions being unoptimised.

There are 3 areas I'm working on at the moment

  • better color palette picking
  • understanding and optimising sixel pixel data encoding algorithm
  • integrating with driver

If outputting frames is just inherently slow then some kind of 'reserve area' method might be required to allow a single render to persist through multiple redraws of the main UI

But for now I think it is too early to think about plugin - it needs to work first!

Also down the line it might be nice to do more with GraphView e.g. output sixel if available or fallback to existing ASCII

@tznind
Collaborator Author

tznind commented Sep 14, 2024

Looking good, but for some reason the colors are off, specifically the dark colors. At first I thought the pixel encoding was redrawing over itself with the wrong colors, or that the palette was missing the dark colors, but after completely replacing the pixel encoding bit I'm pretty sure that part is now correct.

Maybe I can improve situation with some more buttons in scenario e.g. to view the palette used.

Test can be run on a sixel compatible terminal with

dotnet run -- Images -d NetDriver

Image encoding (one off cost) is slow, image rendering is relatively fast (but done every time you redraw screen).

I've included a few algorithms because I thought color issue was bad palette generation or bad color mapping. Might scale it back a bit or provide 1 fast implementation in core and the slow ones in UICatalog as examples.

Looking at this it's also possible the color structs are off somewhere such that RGB is interpreted as ARGB and so the blue element is missing or something.

Also haven't explored dithering yet, which seems to be another big area of sixel image synthesis.

shot-2024-09-14_05-12-41

@tznind
Collaborator Author

tznind commented Sep 14, 2024

Success, bug was indeed just creating the image wrong at the start

Literally the first step in image generation and all because in TG the A is on the right instead of left of the arguments ><.

public static Color [,] ConvertToColorArray (Image<Rgba32> image)
{
    int width = image.Width;
    int height = image.Height;
    Color [,] colors = new Color [width, height];

    // Loop through each pixel and convert Rgba32 to Terminal.Gui color
    for (int x = 0; x < width; x++)
    {
        for (int y = 0; y < height; y++)
        {
            var pixel = image [x, y];
-            colors [x, y] = new Color (pixel.A, pixel.R, pixel.G, pixel.B); 
+            colors [x, y] = new Color (pixel.R, pixel.G, pixel.B); // Convert Rgba32 to Terminal.Gui color
        }
    }

    return colors;
}

image

@dodexahedron
Collaborator

Have you tested out how it behaves if you've altered your terminal color settings/environment variables? Like...do you get predictably ruined colors, if in an indexed color mode, or does it do its best to try to force "correct" colors?

My assumption would be that the output will depend on color depth, with only ANSI or other indexed color schemes being subject to any silliness from that, and consoles capable of true color looking right no matter how ugly one's terminal color scheme may be. But that's just conjecture based on how I'd expect other things to work in most terminals without monkey business going on under the hood. 🤷‍♂️

@dodexahedron
Collaborator

dodexahedron commented Sep 16, 2024

> Success, bug was indeed just creating the image wrong at the start
>
> Literally the first step in image generation and all because in TG the A is on the right instead of left of the arguments ><.

Color can be constructed as ARGB or RGBA. Just pass it the bytes as an int or uint, or if sixel exposes the raw value as an RGBA or ARGB value, use that directly for the constructor.

If it does not expose the whole 32-bit value and you can only get to the bytes, here is each way of doing it in one line and all on the stack:

// The int constructor is RGBA
new Color (BitConverter.ToInt32 ([pixel.R, pixel.G, pixel.B, pixel.A]));
// The uint constructor is ARGB
new Color (BitConverter.ToUInt32 ([pixel.A,pixel.R, pixel.G, pixel.B]));

I could add a direct implicit cast if you like, to make life easier while using it. 🤷‍♂️

IIRC, the uint vs int decision was based on the same or very similar design with System.Drawing.Color, for consistency.

@tznind
Collaborator Author

tznind commented Sep 16, 2024

> Color can be constructed as ARGB or RGBA

Yup, I didn't mean that there was a problem with the constructor param order, just that I made a mistake right at the start and kept thinking the issue was with palette generation.

> sixel exposes the raw value as an RGBA or ARGB value

Sixel exposes RGB only (no A) and it is on a scale of 0-100. You can define up to x colors (typically 256) and those can be any RGB values you like.

For example

#0;2;100;0;0

The # indicates that we are declaring a color. The 0 is the index in the palette we are setting (i.e. the first color). The 2 indicates Type (RGB) - it's basically always going to be a 2. Then you have RGB as 0-100 scaled.

So the above declares the color red (255,0,0) as palette entry 0.

The above text string is the pure ASCII that you would output to the console when rendering the sixel.

You use the palette colors when you encode the pixel data. Rendering pixels involves selecting a color index (from palette) then filling along the band (6 pixels high) with it. Then either 'rewinding' to start of band and drawing with a different color or moving to next band.
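A palette declaration like the one above could be generated from ordinary 0-255 RGB values along these lines. This is a sketch under assumptions: `SixelPaletteSketch`, `DefineColor` and `Scale` are hypothetical names, and rounding to the nearest whole percent is one reasonable choice rather than something the format mandates.

```csharp
using System;

public static class SixelPaletteSketch
{
    // Builds "#<index>;2;<r>;<g>;<b>" where 2 means RGB and each
    // channel is rescaled from 0-255 to the 0-100 range sixel expects.
    public static string DefineColor (int index, byte r, byte g, byte b)
    {
        return "#" + index + ";2;" + Scale (r) + ";" + Scale (g) + ";" + Scale (b);
    }

    // Assumption: round to the nearest integer percentage.
    private static int Scale (byte channel)
    {
        return (int)Math.Round (channel * 100.0 / 255.0);
    }
}
```

Calling `DefineColor (0, 255, 0, 0)` reproduces the `#0;2;100;0;0` example.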

> Have you tested out how it behaves if you've altered your terminal color settings/environment variables?

At this stage I am making something that works and then writing tests and documenting. I have tested in Windows Terminal Preview (the one that supports sixel) and mlterm (on Linux).

There are some terminals that support limited palette sixel (e.g. 16 colors instead of 256). But I think generally if a terminal supports sixel it probably supports true color too.

Once it is done it can be tested for compatibility under corner cases like color setting changes. I think compatibility will have to be left to the user i.e. user can set a config value to support sixel rather than trying to dynamically detect it based on environment vars etc.

@dodexahedron
Collaborator

Interesting.

As for the color values for conversion purposes, I'd suggest an extension method on Sixel colors in the spirit/convention of the common ToBlahBlah or FromBlahBlah methods color types often have. If there's value to you in doing that, of course.

I wouldn't suggest any modifications to Color that directly depend on any types from Sixel, for sake of separation and not building in too deep a dependency, though. 🤷‍♂️

@tznind
Collaborator Author

tznind commented Sep 22, 2024

This is starting to come together.

I now have a pretty firm understanding of sixel encoding and can write tests explaining the expected encoded pixel data.

TODO:

  • more tests
  • screen positioning
  • sixel to screen coordinates measuring
  • potentially some resizing logic in UICatalog (i.e. to show how you ensure sixel doesn't spill out of a View)
  • refactor and finalise the 'out of the box' algorithms and move rest to UICatalog

So far I have not really touched the console drivers. Only hooking in via static to NetDriver - where outputting sixel encoded image is 2 lines of code (console move then console write).

Probably need some guidance on how best to implement in drivers once I've done the above

UnitTests/Drawing/SixelEncoderTests.cs (outdated)
Terminal.Gui/Drawing/Quant/CIE94ColorDistance.cs (outdated)
Terminal.Gui/ConsoleDrivers/NetDriver.cs (outdated)
@tznind tznind changed the title Sixel encoder tinkering Add Sixel support Sep 23, 2024
string expected = "\u001bP" // Start sixel sequence
+ "0;0;0" // Defaults for aspect ratio and grid size
+ "q" // Signals beginning of sixel image data
+ "\"1;1;12;2" // no scaling factors (1x1) and filling 12px width with 2 'sixel' height = 12 px high

Just FYI, those dimensions are both in pixels, so you're filling a background size of 12x2, not 12x12. It doesn't really make any difference in this case, because you're overwriting that background with 12x12 red pixels below, but the comment is wrong.

Collaborator Author

@tznind tznind Sep 23, 2024


Thanks! Seems like I should just remove these then, since they aren't doing anything and are defaults anyway (for the scaling)?

The encoder will always draw all pixels as its input is basically a bitmap with no alpha/transparency.


In theory you can leave off the raster attributes, but it's probably safer to include them. I know I've encountered terminal emulators in the past that couldn't handle sixel images if the size wasn't set (although that was several years ago).

Also if you do remove them, then you must set the first two parameters to 9 and 1, i.e. the sequence should start \u001bP9;1q. That is because the default aspect ratio is actually 2:1 and the default background fill will cover the maximum extent of the page. Setting P1 to 9 gives you a 1:1 aspect ratio, and setting P2 to 1 will disable the background fill so you don't need to worry about the size.
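The two ways of opening a sixel image discussed in this thread could be sketched as follows. `SixelHeaderSketch` and both method names are hypothetical. The raster-attribute form mirrors the string in the test under review, with both dimensions given in pixels as the review comment points out.

```csharp
using System;

public static class SixelHeaderSketch
{
    // Start sequence with raster attributes: "1;1" is the 1:1 aspect ratio,
    // followed by the image width and height, both in pixels.
    public static string StartWithRaster (int widthPx, int heightPx)
    {
        return "\u001bP0;0;0q" + "\"1;1;" + widthPx + ";" + heightPx;
    }

    // Start sequence without raster attributes: P1 = 9 requests a 1:1
    // aspect ratio and P2 = 1 disables the background fill.
    public static string StartWithoutRaster ()
    {
        return "\u001bP9;1q";
    }
}
```

Keeping the raster attributes is the more conservative option given the note above that some terminal emulators have struggled when the size was not set.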

Collaborator Author


> those dimensions are both in pixels, so you're filling a background size of 12x2, not 12x12.

Thanks I have updated in 4571978.

@j4james do you know if there is a reference source for sixel online? I have been mostly working off other MIT open source libraries (node-sixel and libsixel), the Wikipedia article and the iTerm2 docs


I think the best official documentation for Sixel is the VT340 Graphics Programming manual. You should find several web sites hosting copies of that if you search for EK-VT3XX-GP.

But if you're interested in edge cases not covered in the official docs, the best source of information is the vt340test repository. That has numerous test cases with screenshots of their output from a real VT340 terminal.


And if you're looking for examples of other libraries which do Sixel encoding, these are a few that I'm aware of:

@tznind
Collaborator Author

tznind commented Sep 28, 2024

Looks like you can detect sixel support by outputting the terminal querying code:

 "\u001B[c";

If the VT220-level reply contains a '4', sixel graphics are supported.

See this article on sixel

This will respond like this:

 [?61;4;6;7;14;21;22;23;24;28;32;42c

Does support sixel (latest Windows Terminal pre release)

For one that does not support sixel you will see:

 [?61;6;7;21;22;23;24;28;32;42c

Does not support sixel (regular windows terminal)
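Checking a reply like the ones above might be sketched as below. `SixelDetectionSketch` and `SupportsSixel` are hypothetical names, and the response is assumed to have already been read from the terminal as a string. The parameters are split on ';' rather than searching for the character '4', because values such as 14 and 24 also contain that digit.

```csharp
using System;

public static class SixelDetectionSketch
{
    // Expects a reply shaped like "[?61;4;6;...c", optionally preceded by ESC.
    // Returns true only if one of the ';'-separated parameters is exactly "4".
    public static bool SupportsSixel (string response)
    {
        int start = response.IndexOf ("[?", StringComparison.Ordinal);
        int end = response.LastIndexOf ('c');

        if (start < 0 || end < start + 2)
        {
            return false; // not a recognisable device attributes reply
        }

        string [] parameters = response.Substring (start + 2, end - start - 2).Split (';');

        return Array.IndexOf (parameters, "4") >= 0;
    }
}
```

With the two replies quoted above, the first returns true and the second returns false even though it contains 24.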

@tznind
Collaborator Author

tznind commented Sep 28, 2024

@tig / @dodexahedron / @BDisp do we make use of this console querying system? (send escape code, read response). Is there anywhere in the code that does this kind of thing I could tap into?

@BDisp
Collaborator

BDisp commented Sep 28, 2024

In NetDriver's Init method I used a line to read the terminal type. Beyond that, I don't know at the moment.

@tznind
Collaborator Author

tznind commented Sep 29, 2024

OK, nice, it looks like NetDriver is already all set up for sending requests and getting responses.

There is a Queue and then in the handler code it looks for the terminator to match up with outstanding requests.

Only odd thing is that it seems tied to mouse handling. I guess that was the first use case.

What I have implemented works (see image) but I think I should test on other consoles plus also we could look at how best to design this.

Basically you send
<esc>[c

And then you get the response elements and if you see a "4" anywhere in that array then it means sixel support.

<esc>[?61;4;6;7;14;21;22;23;24;28;32;42c
example response

@tig what is the future goal of NetDriver vs WindowsDriver? Are we planning on ditching ncurses driver? I'd rather avoid implementing too much into a driver that is not getting worked on and/or implementing the same thing twice/three times i.e. in each driver.

Changes are in
db0fc41

image

4 participants